Speech Summarization: An Approach through Word Extraction and a Method for Evaluation

Authors

  • Chiori Hori
  • Sadaoki Furui
Abstract

In this paper, we propose a new method of automatic speech summarization for each utterance, where a set of words that maximizes a summarization score is extracted from automatic speech transcriptions. The summarization score indicates the appropriateness of summarized sentences. This extraction is achieved by using a dynamic programming technique according to a target summarization ratio. This ratio is the number of characters/words in the summarized sentence divided by the number of characters/words in the original sentence. The extracted set of words is then connected to build a summarized sentence. The summarization score consists of a word significance measure, linguistic likelihood, and a confidence measure. This paper also proposes a new method of measuring summarization accuracy based on a word network expressing manual summarization results. The summarization accuracy of each automatic summarization is calculated by comparing it with the most similar word string in the network. Japanese broadcast-news speech, transcribed using a large-vocabulary continuous-speech recognition (LVCSR) system, is summarized and evaluated using our proposed method with 20, 40, 60, 70 and 80% summarization ratios. Experimental results reveal that the proposed method can effectively extract relatively important information by removing redundant or irrelevant information.

Key words: speech summarization, sentence compaction, summarization score, dynamic programming, word network of manual summarization result, summarization accuracy
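The word-extraction step described in the abstract can be illustrated with a small dynamic-programming sketch. This is not the authors' implementation: the additive form of the score, the function name `summarize`, the per-word score list `word_scores` (a stand-in for the combined significance and confidence measures), and the callback `link_score` (a stand-in for linguistic likelihood between two consecutively kept words) are all assumptions made for illustration. The target length is derived from the summarization ratio counted in words.

```python
from typing import Callable, List, Sequence


def summarize(
    words: Sequence[str],
    word_scores: Sequence[float],
    link_score: Callable[[int, int], float],
    ratio: float,
) -> List[str]:
    """Pick round(ratio * N) words, in original order, maximizing a total score.

    Assumed (hypothetical) score: sum of per-word scores plus link_score(i, j)
    for every pair of words i < j that end up adjacent in the summary.
    """
    n = len(words)
    if n == 0:
        return []
    # Target number of words; Python's round() rounds halves to the even integer.
    m = max(1, min(n, round(ratio * n)))

    neg_inf = float("-inf")
    # best[k][j]: best score of a summary of k + 1 words whose last word is j.
    best = [[neg_inf] * n for _ in range(m)]
    back = [[-1] * n for _ in range(m)]

    for j in range(n):
        best[0][j] = word_scores[j]

    for k in range(1, m):
        for j in range(k, n):
            for i in range(k - 1, j):
                cand = best[k - 1][i] + link_score(i, j) + word_scores[j]
                if cand > best[k][j]:
                    best[k][j] = cand
                    back[k][j] = i

    # Choose the best final word, then follow back-pointers to recover the path.
    j = max(range(m - 1, n), key=lambda idx: best[m - 1][idx])
    picked = []
    for k in range(m - 1, -1, -1):
        picked.append(j)
        j = back[k][j]
    picked.reverse()
    return [words[i] for i in picked]


if __name__ == "__main__":
    words = ["the", "cabinet", "uh", "approved", "the", "budget"]
    scores = [0.1, 2.0, -1.0, 1.5, 0.1, 2.0]  # made-up illustrative values
    print(summarize(words, scores, lambda i, j: 0.0, 0.5))
```

With a zero `link_score` the sketch simply keeps the highest-scoring words in order; a non-trivial `link_score` (for example, a bigram log-probability) would trade word importance against the fluency of the connected word string, which is the role the abstract assigns to linguistic likelihood.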

Similar resources

EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS

Due to the explosive growth of the world-wide web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and word-net, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...

Evaluation method for automatic speech summarization

We have proposed an automatic speech summarization approach that extracts words from transcription results obtained by automatic speech recognition (ASR) systems. To numerically evaluate this approach, the automatic summarization results are compared with manual summarization generated by humans through word extraction. We have proposed three metrics, weighted word precision, word strings preci...

Evaluation Methods for Automatic Speech Summarization

We have proposed an automatic speech summarization approach that extracts words from transcription results obtained by automatic speech recognition (ASR) systems. To numerically evaluate this approach, the automatic summarization results are compared with manual summarization generated by human subjects through word extraction. We have proposed three metrics, weighted word precision, word strin...

A Study on Statistical Methods for Automatic Speech Summarization

This dissertation proposes a new automatic speech summarization method through word extraction. In this method, a set of words maximizing a summarization score, which indicates the appropriateness of the summarization, is extracted from automatically transcribed speech. This extraction is performed sentence by sentence, according to a target compression ratio, using a dynamic programming technique. The extra...

Comparison of Different Machine Learning Methods for Extractive Speech-to-Speech Summarization of Persian without Using Transcripts

In this paper, extractive speech summarization using different machine learning algorithms was investigated. The task of speech summarization deals with extracting important and salient segments from speech so that speech files can be accessed, searched, extracted, and browsed more easily and at lower cost. In this paper, a new method for speech summarization without using automatic speech recognitio...

Journal title:
  • IEICE Transactions

Volume 87-D  Issue 

Pages  -

Publication date 2004